Differences in Time Delay between Search Engine Crawlers at Web Sites

Authors

  • Jeeva Jose
  • P. Sojan Lal
Abstract

Web log mining provides a great deal of information about user traffic and search engine behavior at web sites. The behavior of search engines can be used to analyze server load, the quality of search engines, the dynamics of search engine crawlers, and the ethics of search engines. Search engine crawlers are highly automated programs that are seldom regulated manually. These crawlers periodically visit web sites to collect information. The dynamics of a search engine crawler can be characterized by the time delay between two consecutive visits. The more often a crawler visits a web site, the more it contributes to the server load. We examine whether there is a significant difference in the time delay between visits of a given search engine crawler. The time delay between visits of different search engine crawlers is also analyzed to identify differences in their behavior.
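The quantity the abstract centers on, the time delay between two consecutive visits of a crawler, can be illustrated with a minimal sketch. The input shape (pairs of crawler name and timestamp), the function name visit_delays, and the sample data are all assumptions made for illustration; the abstract does not describe the authors' log format or the statistical test they applied.

```python
from collections import defaultdict
from datetime import datetime


def visit_delays(visits):
    """Return {crawler: [seconds between consecutive visits]}.

    `visits` is an iterable of (crawler_name, datetime) pairs. This input
    shape is an assumption for illustration, not the paper's log format.
    """
    by_crawler = defaultdict(list)
    for crawler, timestamp in visits:
        by_crawler[crawler].append(timestamp)

    delays = {}
    for crawler, stamps in by_crawler.items():
        stamps.sort()  # visits may arrive out of order across log files
        delays[crawler] = [
            (later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return delays


# Hypothetical sample data, invented for this sketch.
sample = [
    ("crawler_a", datetime(2013, 1, 1, 0, 0, 0)),
    ("crawler_b", datetime(2013, 1, 1, 0, 30, 0)),
    ("crawler_a", datetime(2013, 1, 1, 2, 0, 0)),
    ("crawler_a", datetime(2013, 1, 1, 3, 0, 0)),
    ("crawler_b", datetime(2013, 1, 1, 6, 30, 0)),
]
result = visit_delays(sample)
```

The per-crawler delay lists produced this way would be the input to whatever significance test is chosen, both for comparing delays within one crawler and for comparing different crawlers.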

Similar resources

Analysis of the Temporal Behaviour of Search Engine Crawlers at Web Sites

Web log mining is the extraction of web logs to analyze user behaviour at web sites. In addition to user information, web logs provide immense information about search engine traffic and behaviour. Search engine crawlers are highly automated programs that periodically visit the web site to collect information. The behaviour of search engines could be used in analyzing server load, quality of se...

Mining Web Logs to Identify Search Engine Behaviour at Websites

Web Usage Mining, also known as Web Log Mining, is the extraction of user behaviour from web log data. The log files also provide immense information about the search engine traffic at a website. This search engine traffic is helpful in analysing the ethics of search engines, the quality of the crawlers, the periodicity of the visits and the server load. Search engine crawlers are automated programs w...

Workload-Aware Web Crawling and Server Workload Detection

With the development of search engines, more and more web crawlers are used to gather web pages. The rising crawling traffic has raised the concern that crawlers may impact web sites. On the other hand, a more efficient crawling strategy is required for the coverage and freshness of the search engine index. In this paper, crawlers of several major search engines are analyzed using one six-months acc...

An investigation of web crawler behavior: characterization and metrics

In this paper, we present a characterization study of search-engine crawlers. For the purposes of our work, we use Web-server access logs from five academic sites in three different countries. Based on these logs, we analyze the activity of different crawlers that belong to five search engines: Google, AltaVista, Inktomi, FastSearch and CiteSeer. We compare crawler behavior to the characteristi...

Algorithm for Merging Search Interfaces over Hidden Web

This is the world of information. The size of the World Wide Web [4,5] is growing at an exponential rate day by day. The information on the web is accessed through search engines. These search engines [8] use web crawlers to prepare the repository and update that index at a regular interval. These web crawlers [3, 6] are the heart of search engines. Web crawlers continuously keep on crawling the w...

Publication date: 2013